class: center, middle, inverse, title-slide .title[ # Building interactive maps using Leaflet in R ] .author[ ### Kate Pyper ] --- ## This Session By the end of this session you'll be able to build maps like this:
--- ## This Session During this section we will work through how to build a map in R using Leaflet (~ 1 hour) followed by a practice task for you to work on in small groups supported by the Geospatial Cross Team (~ 1 hour). In the first hour we will cover: - Reading in and manipulating shapefiles - Adding Base maps - Adding areas and point markers (with legends) - Adding and toggling layers Before joining this session you should have installed the geospatial packages on Posit, and also set up your .Rprofile to run the geospatial set up code. --- class: center, middle, inverse # Reading in and manipulating shapefiles --- # What are shapefiles? Shapefiles contain information on how locations look on a map. This is usually some set of coordinates that make up the boundaries of an area or set of areas. The shapefile is actually multiple files *with the same name* that link together to form a full picture of what the area looks like. For R the key files *needed at minimum* to construct a spatial object are: - `.shp` - this file contains the boundary definitions for a set of areas - `.dbf` - this file contains additional data relating to the same set of areas as in the `.shp` file - `.shx` - this is an index file which allows searching of the `.shp` file Shapefiles are available for a variety of area types in Posit in the `Shapefiles` sub folder of: ``` r geo_dir <- "/conf/linkage/output/lookups/Unicode/Geography/" ``` These shapefiles contain optional files which provides richer spatial information (`.prj`) and faster loading. --- # Reading in spatial data The package that is recommended for all things spatial is `{sf}`. `{sf}` treats spatial data in R as a type of data frame called a simple features data frame - the simple features are the (not so simple) definitions of how the data works as a map. To read in a shapefile as a simple features data frame, use the command `read_sf()` and supply the name of the `.shp` file - r will automatically pull in the rest of the files. To read in the locality level shape files: ``` r library(sf) library(readr) library(dplyr) locality_shp <- read_sf(file.path(geo_dir, "Shapefiles/HSCP Locality (Datazone2011 Base)/HSCP_Locality.shp")) ``` This looks like a data frame, and can be treated in generally the same way as a data frame (with some additional quirks!) - The column `geometry` contains the information on the area boundaries which comes from the `.shp` file - The rest of the columns come from information in the `.dbf` file --- # The Coordinate Reference System (CRS) When you read in a shapefile it will have an associated coordinate system in place. you can check what this coordinate system is: ``` r st_crs(locality_shp) ``` ``` ## Coordinate Reference System: ## User input: OSGB 1936 / British National Grid ## wkt: ## PROJCRS["OSGB 1936 / British National Grid", .... ``` Our shapefile is on an ordnance survey grid - usually, we want to have a geography data in latitude longitude format. This corresponds to a coordinate reference system called EPSG:4326. To transform to this coordinate system use `st_transform()`: ``` r locality_shp <- st_transform(locality_shp, crs = 4326) st_crs(locality_shp) ``` ``` ## Coordinate Reference System: ## User input: EPSG:4326 ## wkt: ## GEOGCRS["WGS 84", .... ``` --- # Manipulating spatial data In general you can use a simple features data frame in the same way as any other data frame - including using `{dplyr}` verbs like `filter()`, `mutate()` and `select()` (`select()` will automatically select the `geometry` column) The main operation where you need to be a bit careful is when using joins, as these have an effect on the object type. The rule for these is that *the spatial object has to come first*. When the spatial object comes first spatial attributes stay in place: ``` r locality_lookup <- readr::read_csv(file.path(geo_dir, "/HSCP Locality/HSCP Localities_DZ11_Lookup_20240513.csv")) left_join(locality_shp, locality_lookup, by = join_by(hscp_local == hscp_locality)) ``` ``` ## Simple feature collection with 6976 features and 19 fields ## Geometry type: MULTIPOLYGON ## Dimension: XY ## Bounding box: xmin: -8.650007 ymin: 54.63324 xmax: -0.7246094 ymax: 60.86077 .... ``` When the spatial object comes second they don't: ``` r left_join(locality_lookup, locality_shp, by = join_by(hscp_locality == hscp_local)) ``` ``` ## # A tibble: 6,976 × 20 ## datazone2011 datazone2011name hscp_locality hscp2019name hscp2019 hscp2018 ## <chr> <chr> <chr> <chr> <chr> <chr> ## 1 S01006506 Culter - 01 Aberdeen Sou… Aberdeen Ci… S370000… S370000… .... ``` --- class: center, middle, inverse2 # Building a basic map --- # Base maps One of the benefits to using leaflet is being able to add spatial information on top of a map of the area you are looking at. This is called the base map, and there are a few you can choose from. To add the default base map: ``` r library(leaflet) leaflet() %>% # set up map addTiles() ```
--- # Base maps If you want to add a different base map ([you can view available base maps here](https://leaflet-extras.github.io/leaflet-providers/preview/)): ``` r leaflet() %>% addProviderTiles("Esri.WorldImagery") ```
--- # Markers To add markers to your plot all you need is a set of coordinates containing latitude and longitude. There are a few different types of markers that you can add: - Circle markers - These are equivalent to adding points to a map, can vary size and colour, add labels and pop-ups. - Markers - These are location markers. There is a bit of work involved in customising these. - Awesome Markers - These are location markers that can be customised easily. These will be demonstrated using the GP practice information: ``` r gpprac ``` ``` ## # A tibble: 60 × 17 ## PracticeCode GPPracticeName PracticeListSize AddressLine1 AddressLine2 ## <dbl> <chr> <dbl> <chr> <chr> ## 1 10002 Muirhead Medical Cen… 8274 Muirhead Me… Liff Road .... ``` --- # Circle Markers ``` r leaflet() %>% addTiles() %>% addCircleMarkers(data = gpprac, radius = 1, popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}")) ```
--- # Markers ``` r leaflet() %>% addTiles() %>% addMarkers(data = gpprac, popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}")) ```
--- # Awesome Markers ``` r leaflet() %>% addTiles() %>% addAwesomeMarkers(data = gpprac, icon = awesomeIcons(markerColor = "red", icon = "glyphicon-plus"), popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}")) ```
--- # Customizing colour based on data ``` r pal <- colorFactor(scales::pal_dichromat("Categorical.12")(length(unique(gpprac$GPCluster))), gpprac$GPCluster) leaflet() %>% addTiles() %>% addCircleMarkers(data = gpprac, color = pal(gpprac$GPCluster), radius = 2, fillOpacity = 1, opacity = 1, popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}")) ```
--- # Areas We can also add in areas to the map. These can be filled with a colour that we specify manually, or it can be filled using a colour palette. To the previous map showing GP practices, let's add HSCP boundaries coloured by the number of GP practices per 1,000 list size. ``` r HSCP_shp <- read_sf(file.path(geo_dir, "Shapefiles/HSCP 2019/SG_NHS_IntegrationAuthority_2019.shp")) %>% st_transform(crs = 4326) list_rate <- gpprac %>% group_by(HSCP) %>% summarise(Rate = 1000 * n()/sum(PracticeListSize)) %>% right_join(HSCP_shp, ., by = join_by(HIACode == HSCP)) pal_clus <- colorFactor(scales::pal_dichromat("Categorical.12")(length(unique(gpprac$GPCluster))), gpprac$GPCluster) pal_pop <- colorNumeric(phsstyles::phs_colors(c("phs-teal", "phs-magenta")), list_rate$Rate) leaflet() %>% addTiles() %>% # Always add polygons first otherwise you cover up the points addPolygons(data = list_rate, color = ~pal_pop(Rate)) %>% addCircleMarkers(data = gpprac, color = ~pal_clus(GPCluster), radius = 2, fillOpacity = 1, opacity = 1, popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}")) ``` --- # Areas
--- # Legends We have to manually add legends to leaflet plots. We do this using `addLegend()` ``` r leaflet(height = "75%") %>% # Altered height to make the legends fit addTiles() %>% # Always add polygons first otherwise you cover up the points addPolygons(data = list_rate, color = ~pal_pop(Rate)) %>% addCircleMarkers(data = gpprac, color = ~pal_clus(GPCluster), radius = 2, fillOpacity = 1, opacity = 1, popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}")) %>% addLegend(position = "topright", title = "Practices per 1,000 List Size", pal = pal_pop, values = list_rate$Rate) %>% addLegend(position = "bottomright", title = "GP Cluster", pal = pal_clus, values = gpprac$GPCluster) ``` --- # Legends
--- class: center, middle, inverse3 # Toggling layers --- # Setting Groups One of the great things about Leaflet is the ability to add multiple pieces of information to a single plot, with the option for users to turn off the pieces of information that they don't want to see. Each `add` function (`addMarkers()`, `addPolygons()`, `addLegend()` etc.) has an argument called `group` which allows us to collect everything that relates to a particular group of information. We can then `addLayersControl()` to create a control that allows us to switch groups on and off within the map. As an example we'll combine the polygon and point maps that we've created using the GP data. --- # Adding controls ``` r leaflet() %>% addTiles() %>% # Always add polygons first otherwise you cover up the points addPolygons(data = list_rate, color = ~pal_pop(Rate), `group = "PracticesSize"`) %>% addCircleMarkers(data = gpprac, color = ~pal_clus(GPCluster), radius = 2, fillOpacity = 1, opacity = 1, popup = glue::glue("Practice: {gpprac$GPPracticeName}<br> Cluster: {gpprac$GPCluster}<br> List Size: {gpprac$PracticeListSize}"), `group = "Clusters"`) %>% addLegend(position = "bottomleft", title = "Practices per 1,000 List Size", pal = pal_pop, values = list_rate$Rate, `group = "PracticesSize"`) %>% addLegend(position = "bottomleft", title = "GP Cluster", pal = pal_clus, values = gpprac$GPCluster, `group = "Clusters"`) %>% addLayersControl(position = "topright", options = layersControlOptions(collapsed = FALSE), `baseGroups = c("PracticesSize", "Clusters")`) ``` ---
--- # Group Task The code on the next slide will read in data on dispensing broken down by GP Practice and dispensing location. You should select a health board and then create a map using leaflet that has: - points or markers for dispensing location where the point or marker is coloured by the number of items dispensed - points or markers for GP practice where the point or marker is coloured by the number of items prescribed - Add a toggle to allow practices/dispensing locations to be turned on and off by the user. - locality borders filled based on the proportion of the population in the locality ages 65+ --- # Prescribing/Dispensing Data Set-up ``` r library(readr) library(dplyr) # Link to prescribing data presc_link <- "https://www.opendata.nhs.scot/dataset/4b4be829-c15e-480e-a2fc-996460ff63c6/resource/fa5bbede-475a-4ca9-a71f-3d521657e7c6/download/prescribed-dispensed-annual-2024.csv" # Link to dispenser information data disp_link <- "https://www.opendata.nhs.scot/dataset/a30fde16-1226-49b3-b13d-eb90e39c2058/resource/b287dcbe-8026-4c0b-9838-9f815be22e62/download/dispenser_contactdetails_mar_2024.csv" # Link to GP practice information GP_link <- "https://www.opendata.nhs.scot/dataset/f23655c3-6e23-4103-a511-a80d998adb90/resource/54a6e1e3-98a3-4e78-be0d-1e6d6ebdde1d/download/practice_contactdetails_jan2024-open-data.csv" prescribed <- read_csv(presc_link) gp_lookup <- read_csv(GP_link) disp_lookup <- read_csv(disp_link) gp_presc <- prescribed %>% group_by(PrescriberLocation) %>% summarise(Items = sum(NumberOfPaidItems)) %>% mutate(PrescriberLocation = as.numeric(PrescriberLocation)) %>% right_join(gp_lookup, by = join_by(PrescriberLocation == PracticeCode)) disp_presc <- prescribed %>% group_by(DispenserLocation) %>% summarise(Items = sum(NumberOfPaidItems)) %>% right_join(disp_lookup, by = join_by(DispenserLocation == DispCode)) %>% mutate(Items = if_else(is.na(Items), 0, Items)) ``` --- # Population Data Set-up ``` r locality_lookup <- read_csv("/conf/linkage/output/lookups/Unicode/Geography/HSCP Locality/HSCP Localities_DZ11_Lookup_20240513.csv") pops <- read_csv("/conf/linkage/output/lookups/Unicode/Populations/Estimates/DataZone2011_pop_est_2011_2021.csv") %>% filter(year == max(year)) %>% select(datazone2011, age65:age90plus, total_pop) %>% left_join(locality_lookup) %>% group_by(hscp_locality) %>% summarise(across(c(starts_with("age"), total_pop), sum)) %>% mutate(prop65plus = rowSums(across(starts_with("age")))/total_pop, .keep = "unused") ```